Integrating Provenance Data from Distributed Workflow Systems with ProvManager
Abstract
Running scientific workflows in distributed environments is motivating the definition of provenance gathering approaches that are loosely coupled to the workflow execution engine. Such approaches allow provenance data to be stored and accessed in an integrated way, even in environments where different workflow management systems work together. We have therefore proposed a provenance gathering strategy that is independent of the workflow system technology. This strategy has evolved into a provenance management system named ProvManager. In this paper, we show how ProvManager captures provenance data across a distributed execution environment, and we present its web interface, in which scientists can register experiments, monitor workflow execution, and query provenance data.
Related resources
Managing Provenance in Scientific Workflows with ProvManager
Running scientific workflows in distributed environments is motivating the definition of provenance gathering approaches that are loosely coupled to the workflow systems. We have proposed a provenance gathering strategy that is independent from workflow system technology. This strategy has evolved into a provenance management system named ProvManager. The main principle is that each workflow ac...
Challenges in Managing Implicit and Abstract Provenance Data: Experiences with ProvManager
Running scientific workflows in distributed and heterogeneous environments has been motivating the definition of provenance gathering approaches that are loosely coupled to workflow management systems. We have developed a provenance management system named ProvManager to manage provenance data in distributed and heterogeneous environments independent of a specific Scientific Workflow Management...
Provenance Collection Support in the Kepler Scientific Workflow System
In many data-driven applications, analysis needs to be performed on scientific information obtained from several sources and generated by computations on distributed resources. Systematic analysis of this scientific information unleashes a growing need for automated data-driven applications that also can keep track of the provenance of the data and processes with little user interaction and ove...
Integrating distributed data grid, ontology and Web-based workflow technologies into geospatial cyberinfrastructure: system design and case study
Geospatial research increasingly relies on shared geospatial data, interconnected models and successively refined analysis which requires not only more powerful but also more accessible cyberinfrastructure systems for support. In this study, we propose to integrate data grid, ontology and Web-based workflow technologies to build more accessible cyberinfrastructure systems for geospatial computi...
Project Histories: Managing Data Provenance Across Collection-Oriented Scientific Workflow Runs
While a number of scientific workflow systems support data provenance, they primarily focus on collecting and querying provenance for single workflow runs. Scientific research projects, however, typically involve (1) many interrelated workflows (where data from one or more workflow runs are selected and used as input to subsequent runs) and (2) tasks between workflow runs that cannot be fully a...
Publication date: 2010